Abstract: A new measurement method for the spatial distribution of neutron beam flux in boron neutron 
capture therapy (BNCT) is being developed based on the two-dimensional Micromegas detector. To address 
the issue of long processing time in traditional offline position reconstruction methods, this paper proposes an 
FPGA-based online position reconstruction method, grounded in the micro time projection chamber principle. 
This method comprises three key techniques: a self-adaptive serial link technique built upon the dynamic 
adjustment of delay chain length, a fast sorting and coordinate matching technique based on the mapping between 
signal timestamps and random access memory (RAM) addresses, and a precise start point merging technique 
utilizing a circular combined RAM. The performance test of the self-adaptive serial link shows that the bit error 
rate of the link is better than 10⁻¹² at a confidence level of 99%, ensuring reliable data transmission. The 
combined experiment of the readout electronics and the Micromegas detector shows a spatial resolution of 
approximately 1.4 mm, surpassing the current method's resolution level of 5 mm. The beam experiment 
confirms that the readout electronics system can obtain the flux spatial distribution of neutron beam online, 
thus validating the feasibility of the position reconstruction method. The online position reconstruction method 
avoids traditional methods such as bubble sorting and traversal searching, simplifying the design of logic 
firmware and reducing the time complexity from O(n²) to O(n). This study contributes to the advancement in 
measuring neutron beam flux for BNCT. 
1. Introduction 


Boron Neutron Capture Therapy (BNCT) is an advanced radiation therapy technology that can 
selectively destroy tumors at the cellular level [1-2]. With advancements in accelerator-based neutron sources in 
recent years, BNCT has overcome the nuclear safety risks associated with traditional reactor-based 
neutron sources [3-5]. This progress is steadily improving the prospects for its clinical application in hospitals. 


The spatial distribution of neutron beam flux is a critical parameter in the physical and clinical dosimetry of 
BNCT, fundamentally influencing the precision of radiation dosage. However, there is currently no 
internationally standardized measurement method for this parameter. Traditional methods, such as 
point-by-point scanning with active detectors or spatial arrangement of passive detectors, suffer from drawbacks such as 
long measurement time, complex operation, and low accuracy [3-9]. 

The Micromegas (Micro-Mesh-Gaseous-Structure) detector is a two-dimensional micropattern gaseous 
detector [10]. It offers fast two-dimensional beam imaging via a single measurement [11]. In this study, 
Micromegas detectors are adopted to measure the spatial distribution of neutron beam flux in BNCT. The 
measurement system, which consists of the Micromegas detector and readout electronics, is depicted in Fig.1 [12-13]. 
The Micromegas detector uses a LiF film as the converter, initially engaging in the ⁶Li(n, α)³H reaction with 
neutrons. The resultant charged particles undergo primary ionization and sequential avalanche amplification in 
the drift and amplification gaps, respectively. Finally, current signals are induced on the anode readout strips. 
The utilized Micromegas detector is equipped with 192 readout strips both in the X-direction and Y-direction. 
The 384 readout strips are connected to the readout electronics system through several micro coaxial flat cables. 
The readout electronics are designed as a modular structure in the PXI Express (PXIe) platform, including 12 
modular data acquisition boards (DABs) and 1 clock command board (CCB) [13]. The DABs amplify, digitize, 
and shape the current signals from 384 detector strips. The CCB is located in the system timing slot and 
distributes the system clock and the start and stop acquisition commands to the 12 DABs via star buses of the chassis 
backplane. 
Fig.1 The measurement system consists of the Micromegas detector and readout electronics. 

The rapid acquisition of measurement results is important for improving equipment practicality and 
measurement efficiency. Although the Micromegas detector shortens the data acquisition process, 
the data analysis process can still be time-consuming. This is because conventional readout electronics systems 
typically only handle data acquisition, with the collected data being analyzed offline through software [14-18]. 
Analyzing the large volume of data normally takes several minutes to tens of minutes, which does not meet 
user requirements. The most time-consuming part of offline software data processing is particle position 
reconstruction. To address this issue, this study utilizes the parallel computing and real-time 
capabilities of the FPGA (Field-Programmable Gate Array) to perform online particle position reconstruction. 

Two particle position reconstruction principles are commonly employed when the Micromegas detector is 
used for particle position measurement: the charge centroid principle and the μTPC (micro time projection 
chamber) principle [19-21]. The charge centroid principle reconstructs the hit position by calculating the 
average of the strip positions weighted by the strip charge. The μTPC principle treats the Micromegas detector 
as a miniature time projection chamber. It combines a group of signals caused by a single incident particle and 
merges them into the signal with the latest arrival time within the group. The readout strip corresponding to 
the signal with the latest arrival time is considered to be the entry point of the particle [19,20]. This process is 
called start point merging in this study. 

The μTPC principle has a simpler calculation process than the charge centroid principle when it is implemented 
in an FPGA. Furthermore, multiple studies have shown that the charge centroid principle is suitable for 
cases where charged particles are emitted approximately vertically, and that its position reconstruction accuracy 
rapidly deteriorates as the emission angle increases. In contrast, the μTPC principle exhibits excellent position 
reconstruction accuracy across the entire angular range [19,20]. The secondary charged particles resulting from 
⁶Li(n, α)³H neutron interactions are randomly emitted over a 4π solid angle. Based on these considerations, 
the μTPC principle is chosen for neutron position reconstruction in this study. 

According to the μTPC principle, particle position reconstruction involves operations such as data 
aggregation from multiple DABs, signal time sorting, start point merging, and XY coordinate matching. Due 
to the high neutron flux, data acquisition only takes several seconds. After data acquisition is completed, the 
particle position reconstruction process is initiated, and the reconstructed data are uploaded from the CCB to 
the chassis controller. 

Computational software for general computers often includes well-developed sorting and searching 
algorithms. However, these algorithms are not suitable for direct porting to FPGA. On the one hand, FPGA has 
limited on-chip memory resources compared to general computers, and its underlying architecture differs from 
that of CPUs (central processing units). Therefore, the FPGA is not proficient in sorting and searching 
operations involving large amounts of data. Implementing these software algorithms directly on an FPGA 
would lead to complex firmware design. On the other hand, the algorithms provided by conventional software 
often have high time complexity. For instance, typical algorithms like bubble sorting and traversal search have 
a time complexity of O(n²) [22]. If these software algorithms are directly ported to an FPGA, they may not 
fully leverage the hardware acceleration advantages of the FPGA. 

To sum up, the demand for rapidly obtaining measurement results poses a challenge to the readout 
electronics system in terms of FPGA-based online particle position reconstruction. This paper exploits 
the characteristics of the FPGA and proposes a hardware-friendly online position reconstruction method. 
2. Online hardware position reconstruction method 


2.1 Data aggregation built on the self-adaptive serial link technique 


The first step in aggregating data is to establish the transmission link between the DABs and the CCB. 
The CCB serves as the main site for online position reconstruction. The DABs and the CCB are connected via 
serial star buses. The clock phase relationship between the two ends may vary from one power-on cycle to the 
next, which poses a challenge for stable data transmission over the serial link. To address this issue, this study 
proposes a self-adaptive serial link that dynamically adjusts the delay chain length. The composition of the link 
is illustrated in Fig.2 (a). 

The data aggregation link works in full-duplex mode. Read requests are transmitted as high-level voltage 
pulses via the PXIe DSTARB star bus, and the data are aggregated via the PXIe DSTARC star bus [23]. The 
OSERDES (output serializer/deserializer) and corresponding ISERDES (input serializer/deserializer) modules 
in Fig.2 (a) are used for data serialization and deserialization, respectively [24]. The IBUFDS (differential input 
buffer) and OBUFDS (differential output buffer) modules convert between single-ended and differential signal 
levels. The IDELAY (input delay resource) chains are programmable and play a core role in the self-adaptive 
serial link. They are located on the input path of the FPGA pins, and each includes 64 delay elements [25]. By 
changing the length of the delay chains at the receiver end, the phase relationship between the data and the 
clock can be adjusted, ensuring proper sampling of the data by the clock. 


Since this study adopts a strategy of collecting data first and then performing position reconstruction, there 
is no special requirement for the link rate. This study opts for a clock with a frequency of 62.5 MHz, half of the 
FPGA device clock. Thus, the corresponding serial aggregation link rate is 625 Mbps. 


Before the DABs can send parallel data to the CCB, an initialization process is required to establish the 
connection of the link. The initialization process involves the transmission of handshake signals through the 
PXI_STAR bus. The steps for establishing the connection of the self-adaptive serial link, shown in Fig.2 (b), 
are as follows. 


Step 1: The DAB sends a K28.5 control code stream through the PXIe DSTARC bus, while the CCB 
keeps the PXI_STAR bus at a low voltage level. 

Step 2: The CCB utilizes the ISERDES module to detect and locate the boundaries of the serial data 
stream for checking the recovery of the K-code. If the CCB fails to recover the K28.5 code, the system 
proceeds to Step 3; otherwise, it proceeds to Step 4. 

Step 3: The CCB dynamically adjusts the IDELAY chain length by adding a delay element to the 
IDELAY chain. Then the system proceeds to Step 1. 

Step 4: If a certain number of consecutive K28.5 control codes are received, the system proceeds to 
Step 5; otherwise, it returns to Step 3. 

Step 5: The CCB sets the PXI_STAR bus to a high voltage level, and the DAB sends a sequence of 
incrementing codes (0x00 to 0xFF). Then the system proceeds to Step 6. The reason for sending 
incrementing codes from 0x00 to 0xFF is that the receiving end may be in a metastable state in which 
it can correctly receive the K28.5 control code but not some other data. The incrementing codes help 
detect whether the receiving end is in a stable state. 

Step 6: If the incrementing code is correctly received by the CCB, the system proceeds to Step 7; 
otherwise, it proceeds to Step 3. 

Step 7: The initialization of the link is complete. The link is now ready for data transmission. The 
system will continue to transmit the K28.5 control code and keep the PXI_STAR signal high when 
the link is idle. 
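
The seven steps above can be summarized as a sweep over the delay chain length. The following Python sketch is only a behavioral model, not the firmware: the function names, the `receive` callback, the number of consecutive K28.5 codes required (`k_needed`), and the toy channel are all assumptions made for illustration. The only values taken from the text are the 64 delay elements and the 0x00 to 0xFF pattern; 0xBC is the standard 8-bit value of K28.5 before 8b/10b encoding.

```python
K28_5 = 0xBC  # 8-bit value of the K28.5 control character before 8b/10b encoding


def train_link(receive, num_taps=64, k_needed=16):
    """Return the first delay tap that passes both checks, or None.

    receive(tap, sent) is a hypothetical model of the receiver: it returns
    the byte recovered when `sent` is transmitted with the delay chain
    length set to `tap`. k_needed is an assumed value.
    """
    for tap in range(num_taps):  # Step 3: one more delay element per retry
        # Steps 1, 2 and 4: require k_needed consecutive K28.5 codes.
        if any(receive(tap, K28_5) != K28_5 for _ in range(k_needed)):
            continue
        # Steps 5 and 6: the incrementing pattern exposes marginal taps.
        if all(receive(tap, byte) == byte for byte in range(256)):
            return tap  # Step 7: link ready
    return None


def toy_receive(tap, sent):
    """Toy channel (assumption): bad, then marginal, then good sampling points."""
    if tap < 20:
        return sent ^ 0x01  # every byte is corrupted
    if tap < 25:
        return sent & 0xFE  # K28.5 (0xBC) survives, odd bytes do not
    return sent


locked_tap = train_link(toy_receive)
```

In this toy channel, taps 20 to 24 pass the K28.5 check but fail the incrementing pattern, which is the situation Step 5 is designed to catch.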
Fig.2 Data aggregation built on the self-adaptive serial link technique. (a) The composition of the data 
aggregation link. (b) The flow chart of the initialization of the self-adaptive serial link. (c) The data 
aggregation process between CCB and DABs. 

To accommodate the limited on-chip storage resources of the FPGA, this study adopts a time slicing 
approach for data aggregation. The data aggregation process between the CCB and the DABs is depicted in Fig.2 (c). 
Once data acquisition is completed, the CCB sends a data read request. Next, each DAB reads the waveform 
data from the onboard DDR (double data rate) memory, where the waveform data of over-threshold valid 
signals are preserved in time order. The waveform data are then processed to extract each signal's 
amplitude, timestamp, and channel number, which are then stored in a FIFO (first-in, first-out) buffer. The 
DAB reads the data corresponding to the next time slice from the FIFO and sends them to the CCB. 
Subsequently, the CCB stores the data from all 12 channels into a width conversion FIFO. This width 
conversion step ensures that the port transmission capacity of the 12:1 switch is consistent, preventing data 
congestion during the aggregation process. Finally, the CCB polls the width conversion FIFO to retrieve the 
data and writes the signal information into two separate FIFOs: one for the X-direction and one for the 
Y-direction. After completing the data aggregation of a time slice, the CCB proceeds with subsequent data 
processing operations. Once finished, it sends a data read request for the next time slice in preparation for the 
next round of processing. 


2.2 The fast sorting and coordinate matching technique based on timestamp to RAM address mapping


Due to independent acquisition among DABs, the data aggregated into the X and Y direction FIFOs are 
time-disordered. Therefore, this study proposes a fast sorting technique based on the mapping between signal 
timestamps and random access memory (RAM) addresses, as described in Fig.3 (a). The timestamp of a signal 
is mapped to the address of an FPGA on-chip RAM, while the channel number and amplitude of the signal are 
the contents of the RAM storage. Fig.3 (a) also provides an intuitive understanding, where time-ordered signals 
on the time axis are rotated by 90 degrees, resulting in a structure that closely resembles RAM. 

The depth of the RAM which stores the data of a single time slice should be equal to the length of the time 
slice. The timestamp to RAM address mapping uses the lower N bits of the timestamp as the address of the 
RAM, where N and the depth of the RAM (Depth) satisfy 2^N = Depth. This allows the N-bit 
timestamp to be directly mapped onto the entire circular RAM address space. The sorting module writes the data to the 
first-level RAM based on the mapped address. One RAM address corresponds to a timestamp unit of 10 ns, 
which is the sampling interval of the ADC. Considering FPGA resource utilization, the RAM depth is set to 
2048 (N = 11), so the time slice length is 20.48 μs. The probability of two signal timestamps being the same is very low, 
so no other operations are needed to handle this situation. 
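
The mapping described above can be illustrated with a short software model. This is a minimal Python sketch under stated assumptions, not the firmware: the function names and the tuple layout are hypothetical, timestamps are expressed in 10 ns ADC sampling units, and all signals are assumed to belong to a single time slice. As in the text, a timestamp collision is not handled (a later write simply overwrites the earlier one).

```python
DEPTH = 2048           # RAM depth from the text, so N = 11
ADDR_MASK = DEPTH - 1  # keeps the lower N bits of the timestamp


def sort_time_slice(signals):
    """Write (timestamp, channel, amplitude) tuples into a RAM-like list.

    Each signal costs one write, so the cost grows linearly with the
    number of signals instead of quadratically as in bubble sorting.
    """
    ram = [None] * DEPTH
    for timestamp, channel, amplitude in signals:
        ram[timestamp & ADDR_MASK] = (channel, amplitude)
    return ram


def read_in_time_order(ram):
    """Scan the addresses upward; non-empty entries come out time-ordered."""
    return [(addr, *entry) for addr, entry in enumerate(ram) if entry is not None]


# Time-disordered input, as produced by independently running DABs.
unsorted_signals = [(4100, 7, 30), (4097, 12, 55), (4099, 8, 41)]
ordered = read_in_time_order(sort_time_slice(unsorted_signals))
```

Reading the list from address 0 upward returns the three signals in time order without any comparison between signals.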

After sorting and merging processes in the X and Y directions, further coordinate matching operation is 
required. Coordinate matching follows two principles: time alignment and nearest matching. Time alignment 
means that the time difference between the timestamps of the X-direction and Y-direction signals should be 
smaller than the predefined time window. Nearest matching means that if two or more signals are found within 
the time window, the X and Y direction signals with the closest timestamps are matched. 

A typical coordinate matching process is as follows: For each X-direction signal, search for the Y-direction 
signal that has the smallest time difference. If the time difference between this pair of X and Y signals falls 
within the time window, generate a particle coordinate. Otherwise, the X-direction signal is considered as an 
isolated signal without a matching Y-direction signal. Then continue processing the next X-direction signal 
until all X signals have been traversed. For each X-direction signal, this coordinate matching process requires 
a search through the Y-direction signals, resulting in a nested loop traversal. Therefore, the time complexity 
of this method is O(n²). 

This study proposes a fast coordinate matching technique based on the previous timestamp to RAM 
address mapping, as shown in Fig.3 (b). The merged signals are stored in the second-level RAM, with addresses 
still following the timestamp to RAM address mapping relationship. The X-direction and Y-direction signals 
share the same RAM address reference, which means they are aligned on the time axis. Therefore, when 
searching for a matching Y-direction signal for each X-direction signal, it is only necessary to expand a range 
of RAM addresses as the time window in the Y-direction RAM and find the nearest non-empty address within 
the time window. This address corresponds to the matched Y-direction signal. The coordinate matching for all 
signals can be completed by traversing through the second-level RAM of the X-direction. The reconstructed 
event information is then stored in a FIFO buffer and transmitted to the chassis controller via the PCIe 
(peripheral component interconnect express) bus. 
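
The window search described above can be modeled in a few lines. The following Python sketch is illustrative only: the function name, the window of 3 address units, the preference for the earlier Y address when two candidates are equally close, and the fact that a matched Y signal is not removed are all assumptions, and the boundary between time slices is ignored. Each RAM is modeled as a list whose non-empty entries hold a strip number.

```python
def match_xy(x_ram, y_ram, window=3):
    """Pair each X signal with the nearest Y signal within +/- window addresses."""
    depth = len(x_ram)
    # Probe offsets 0, -1, +1, -2, +2, ... so that the first hit is the nearest.
    offsets = [0] + [sign * dist for dist in range(1, window + 1) for sign in (-1, 1)]
    events = []
    for addr, x_strip in enumerate(x_ram):
        if x_strip is None:
            continue  # empty address, no X signal at this timestamp
        for offset in offsets:
            y_addr = addr + offset
            if 0 <= y_addr < depth and y_ram[y_addr] is not None:
                events.append((x_strip, y_ram[y_addr]))
                break  # nearest Y signal found; unmatched X signals are dropped
    return events


x_ram = [None] * 16
y_ram = [None] * 16
x_ram[5], x_ram[12] = 40, 41   # X strips hit at time addresses 5 and 12
y_ram[6], y_ram[8] = 90, 91    # Y strips hit at time addresses 6 and 8
events = match_xy(x_ram, y_ram)
```

Because the number of probed addresses per X signal is bounded by the window, the total work scales with the number of X signals rather than with the product of the X and Y signal counts.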

The fast sorting and coordinate matching technique based on the timestamp to RAM address mapping 
offers the advantages of low time complexity and simple implementation. On the one hand, this technique avoids 
complex conventional algorithms like bubble sort and traversal search, and thus the time complexity is reduced 
from O(n²) to O(n). On the other hand, this technique reduces sorting and searching to basic RAM 
read-write operations, and is relatively simple to implement on an FPGA. 
Fig.3 The fast sorting and coordinate matching technique based on timestamp to RAM address mapping. (a) 
Sorting process. (b) Matching process. 


2.3 The precise start point merging technique utilizing a circular combined RAM 


According to the μTPC principle, after the sorting operation, it is necessary to merge the groups of signals 
that have close channel numbers and timestamps into the starting signal of each group. When merging the 
signals at the end of each time slice, their starting signals may exist at the beginning of the next time slice. 

This study designs a circular combined RAM structure to effectively solve this problem, as shown in Fig.4 
(a). When merging the signals at the tail of one RAM slice, the signals at the head of the other RAM slice can 
be read to determine whether merging is needed. The merged signals are then written into the second-level 
RAM for subsequent coordinate matching operations. 

Each RAM is read and written by both the upstream and downstream modules. It is crucial to design a 
well-coordinated timing relationship to prevent conflicts between RAM read and write operations. To address this issue, 
dual-port on-chip RAMs are utilized. The read and write operations of the RAM are coordinated according to 
the status signals from the upstream and downstream modules. The explanation below focuses on the first-level 
RAM involved in the sorting and merging modules, but the same principles also apply to the second-level RAM 
used in the merging and matching modules. 

The first-level circular RAM consists of sections A and B, and each section corresponds to a single 
time slice. In the initial state, when the RAM is empty, the sorting module writes data into sections A and B. It 
then sends a sorting completion signal to the merging module. After the initial state, every time the sorting 
module finishes writing a time slice of data into one section of the RAM, it sends a sorting completion signal 
to the merging module. Upon receiving the sorting completion signal, the merging module performs the 
merging operation on the other section of the RAM and sends a merging completion signal to the sorting module. 
It is important to note that the section of the RAM used for merging in each iteration is different from the 
section where the sorting module has just written. The two sections of RAM are used in a ping-pong-like 
manner, as depicted in Fig.4 (a). This process is repeated to complete the merging of signals for all time slices. 
Furthermore, taking advantage of the parallel computing capabilities of FPGA, the signal merging operations 
can be performed simultaneously in both the X direction and Y direction. Fig.4 illustrates only one direction of 
the merging operation, but the same operations are being carried out in the other direction as well. 
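
The alternation between sections A and B can be written down as a simple schedule. The Python sketch below is an illustrative model rather than the firmware: the function name is hypothetical, and the treatment of the final time slice (no following section to look into) is an assumption, since the text does not describe it.

```python
def ping_pong_schedule(num_slices):
    """Yield (slice_index, merged_section, peeked_section) for each merge pass.

    The merged section holds the time slice being merged; the peeked section
    holds the following slice, whose head is read to complete groups that
    cross the slice boundary.
    """
    sections = ("A", "B")
    for index in range(num_slices):
        merged = sections[index % 2]
        # Assumption: the last slice has no successor to peek into.
        peeked = sections[(index + 1) % 2] if index + 1 < num_slices else None
        yield index, merged, peeked


schedule = list(ping_pong_schedule(4))
```

In this model the sorting module always refills the section that has just been merged, which is why the section being merged is never the one most recently written.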

The number of readout strips hit by a charged particle varies with the particle's 
emission angle. Therefore, it is uncertain how many other signals need to be read when merging each signal. 
To address this issue, this study adopts a successive merging approach based on the circular combined RAM, 
as illustrated in Fig.4 (b). During the merging of a signal, a time window is opened starting from this signal 
and extending towards later times. Within the time window, only the first signal whose channel number is close 
to that of the current signal is sought. If such a signal is found, there is no need to check for 
other signals within the time window. The current signal is merged into the later signal, updating its amplitude 
to the sum of the two and incrementing the merged channel count by one. The amplitude and the merged 
channel count can support further event selection. If no candidate for merging is found 
within the time window, the current signal is the starting signal, and it is written 
to the same address in the second-level RAM. 
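
The successive merging procedure can be modeled as follows. This Python sketch rests on several assumptions that are not given in the text: the function name, the window length of 8 address units, the criterion that "close" channel numbers differ by at most one strip, and the tuple layout (channel, amplitude, merged channel count). The merged counts are summed, which for a simple chain of signals is equivalent to incrementing the count by one per merge. The look-ahead into the next time slice is omitted for brevity.

```python
def merge_start_points(first_level, window=8, max_strip_gap=1):
    """Merge each signal into the first later signal on a nearby strip.

    Entries are None or (channel, amplitude, merged_count). The returned
    list models the second-level RAM and holds only starting signals.
    """
    ram = list(first_level)  # working copy of the first-level RAM
    second_level = [None] * len(ram)
    for addr in range(len(ram)):
        entry = ram[addr]
        if entry is None:
            continue
        channel, amplitude, count = entry
        for later in range(addr + 1, min(addr + 1 + window, len(ram))):
            candidate = ram[later]
            if candidate is not None and abs(candidate[0] - channel) <= max_strip_gap:
                # Fold the current signal into the later one and stop searching.
                ram[later] = (candidate[0], candidate[1] + amplitude, candidate[2] + count)
                break
        else:
            # No later neighbour within the window: this is the starting signal.
            second_level[addr] = entry
    return second_level


first_level = [None] * 32
first_level[2] = (10, 5, 1)    # three strips fired by one track ...
first_level[4] = (11, 7, 1)
first_level[5] = (12, 3, 1)
first_level[20] = (50, 9, 1)   # ... and one isolated signal
merged = merge_start_points(first_level)
```

The three signals on strips 10, 11, and 12 collapse into a single entry at the address of the latest one, carrying the summed amplitude and a merged channel count of three.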

Because the RAM contains many empty addresses, these blank addresses should be skipped 
to improve data processing efficiency. This applies not only to the merging module but also to the coordinate 
matching module. To this end, the study adopts a dynamic indexing register that marks whether each address 
in RAM contains valid data, enabling the blank addresses to be skipped. Taking the first-level RAM shown in 
Fig.4 as an example, the number of bits in the indexing register is equal to the RAM depth, and each bit 
corresponds to one address in RAM. During time sorting, if data are written to a specific address in RAM, 
the corresponding bit in the register is set to 1. While traversing the valid data in RAM to merge 
signals, a multiple branch selection statement (the casex statement in Verilog) is used to quickly locate 
the first non-zero bit in the indexing register. This corresponds to the first non-empty storage address in RAM. 
Once the merging of the signal at the corresponding address is completed, the corresponding bit in the indexing 
register is set to 0. This process is repeated, and once the merging of signals in the entire time slice is completed, 
all bits in the indexing register have been dynamically cleared to zero. 
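
The role of the indexing register can be illustrated in software. The Python sketch below does not reproduce the casex-based priority encoder used in the firmware; it swaps in an equivalent two's-complement bit trick to find the lowest set bit. The function names are hypothetical, and treating the lowest address as the "first" non-zero bit is an assumption.

```python
def first_valid_address(index_register):
    """Return the lowest address whose valid bit is set, or None if empty."""
    if index_register == 0:
        return None
    # index_register & -index_register isolates the lowest set bit.
    return (index_register & -index_register).bit_length() - 1


def traverse_valid_addresses(index_register):
    """Visit valid addresses in ascending order, clearing each bit after use."""
    visited = []
    while index_register:
        addr = first_valid_address(index_register)
        visited.append(addr)              # the merging step would run here
        index_register &= ~(1 << addr)    # mark the address as processed
    return visited


# Register for a 2048-deep RAM with valid data at addresses 3, 17 and 2047.
register = (1 << 3) | (1 << 17) | (1 << 2047)
visited = traverse_valid_addresses(register)
```

The loop runs once per valid entry rather than once per RAM address, which is the saving the indexing register is meant to provide.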
Fig.4 The precise start point merging technique utilizing a circular combined RAM. (a) The circular 
combined RAM structure for precisely merging signals at the boundary of two time slices. (b) The successive 
merging process. 


3. Tests and verification 


3.1 Self-adaptive serial link test 


The self-adaptive serial link is the foundation for position reconstruction, so its reliability must be 
tested. The performance test of the serial link is carried out from two perspectives: the eye diagram 
and the bit error rate (BER). 

The eye diagram of a digital signal reflects its overall reliability and quality [26]. The test result, 
shown in Fig.5, indicates a clear eye diagram with a wide opening. The eye width measures 
approximately 1400 ps out of the 1600 ps unit interval at 625 Mbps, which indicates the excellent quality of the serial link. 

A further BER test was carried out to verify whether dynamically adjusting the delay chain length 
can achieve accurate clock sampling of serial data, thus examining the feasibility of the self-adaptive serial link. 
In the BER test, the DAB sends a pseudo-random number sequence of a given length, while the CCB recovers 
the pseudo-random numbers and compares them with locally generated pseudo-random numbers in the same 
mode. According to the hypothesis testing theory in statistics, the relationship between link error rate and 
confidence level is shown in equation (1). In the equation, n and k represent the number of bits in the 
pseudo-random sequence and the number of detected errors, respectively. CL denotes the confidence level at which the 
link error rate is better than p [27]. The BER test lasted approximately 3 hours, during which a 
pseudo-random number sequence of 6.75×10¹² bits was transmitted with the link rate set to 625 Mbps. Zero 
errors were detected during the test. It can be inferred that the BER is better than 10⁻¹² at a confidence 
level of 99%, meeting the reliability standards of conventional high-speed serial links such as the PCIe bus. 
This illustrates that, despite the inherent lack of synchronization between the local receiver clock and the 
incoming serial data stream, the self-adaptive serial link protocol proposed in this study allows the receiver to 
dynamically adjust the length of the programmable delay chains. This, in turn, automatically fine-tunes the 
phase relationship between the serial data stream and the clock, ultimately enabling the asynchronous receiver 
clock to perform precise sampling of the serial data stream. In summary, the experimental results indicate the 
capability of the self-adaptive serial link to ensure stable and dependable data transmission. 
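
The quoted bound can be checked with a short calculation. Equation (1) is not reproduced in this text, so the sketch assumes the standard binomial relation CL = 1 - sum_{i=0..k} C(n,i) p^i (1-p)^(n-i), which for k = 0 reduces to CL = 1 - (1-p)^n. The function name is hypothetical; the bit count, link rate, and confidence level are taken from the text.

```python
import math


def ber_bound_zero_errors(n_bits, confidence):
    """Upper bound p on the BER after n_bits received with zero errors.

    Solves CL = 1 - (1 - p)**n_bits for p (assumed form of equation (1)
    with k = 0). expm1 and log1p keep precision for the very small p.
    """
    return -math.expm1(math.log1p(-confidence) / n_bits)


n_bits = 625e6 * 3 * 3600            # 625 Mbps for about 3 hours
ber_bound = ber_bound_zero_errors(n_bits, 0.99)
print(f"n = {n_bits:.3g} bits, BER < {ber_bound:.2e} at 99% confidence")
```

Under this assumption the bound comes out slightly below 7×10⁻¹³, consistent with the statement that the BER is better than 10⁻¹² at a confidence level of 99%.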

The notable advantages of the self-adaptive serial link lie in the simplicity of physical link connections 
and the ability to easily implement link protocols using common FPGA logic resources. Consequently, it 
addresses a constraint encountered in our study: the limited number of backplane buses on the PXIe 
platform. In contrast, conventional synchronous parallel transmission methods necessitate dedicated clock 
signal paths and multiple parallel data links at both transmitter and receiver ends. Besides, traditional 
high-speed serial transmission methods based on Gigabit transceivers (GT) consume the scarce GT core resources of 
the FPGA, making them unsuitable for aggregating data across a dozen or more circuit boards. 
Fig.5 The eye diagram of the self-adaptive serial link. 


3.2 The position resolution experiment 


The position resolution experiment was conducted with a standard Am-Be neutron source and a known 
semi-circular ⁶LiF conversion film affixed to the Micromegas detector. Due to the relatively low intensity of 
the Am-Be neutron source, in this particular experiment, a thick (2 μm) pure ⁶LiF conversion film was utilized. 
The imaging result of the Am-Be neutron radiation field is shown in Fig.6 (a). The system's spatial resolution 
can be estimated by fitting the neutron count distribution at the semi-circle's diameter boundary. The image is 
further rotated by 45°, and neutron counts are accumulated along the diameter boundary of the semi-circle. 

The experimental distribution B(x') of accumulated hits at the semicircle's diameter boundary is the 
convolution of the actual particle distribution T(x) and the system's Gaussian response function R(x, x'), as 
shown in equation (2) and equation (3) [28-29]. Positional resolution tests for micropattern gas detectors 
typically employ specific shaping components to adjust the spatial distribution T(x) of particles entering the 
detector into the expected distribution. Since neutrons are electrically neutral, the conversion film itself can 
serve as the shaping component. Narrow slits or blades are the most commonly used shaping components 
because the expected distributions T(x) of neutrons passing through them are simple δ and step distributions, 
respectively. For simplicity in the experiment, an existing circular conversion film was divided into two halves, 
where the diameter of a semicircle can be regarded as a blade. Consequently, the actual distribution of incident 
particles at the diameter conforms to a step distribution. The experimental distribution B(x') is therefore 
simplified to a cumulative Gaussian distribution, as shown in equation (4). By fitting the experimental data, as 
depicted in Fig.6 (b), the standard deviation (σ) of the fit can be obtained, which indicates a spatial resolution 
of about 1.4 mm. As mentioned in the introduction, the conventional measurement methods mainly rely on 
point-by-point scanning with active detectors or spatial arrangement of passive detectors. The accuracy of these 
methods is about 5 mm to 1 cm [30-33]. In comparison, the method proposed in this study improves the 
measurement accuracy to 1.4 mm. 


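$$B(x') = \int T(x)\, R(x, x')\, dx \tag{2}$$

$$R(x, x') = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - x')^{2}}{2\sigma^{2}}\right] \tag{3}$$

$$B(x') = \int_{-\infty}^{x'} R(x)\, dx \tag{4}$$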
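
The link between the width of the measured edge and σ in equation (4) can be illustrated numerically. The Python sketch below uses synthetic, noise-free data and does not perform the least-squares fit used in the paper; instead it reads σ from the 15.9% and 84.1% levels of the cumulative Gaussian, which lie one σ on either side of the edge. The function names, the edge position at 0 mm, and the sampling grid are assumptions.

```python
import math


def edge_response(x, x0, sigma):
    """Cumulative Gaussian: an ideal step at x0 blurred with width sigma."""
    return 0.5 * (1.0 + math.erf((x - x0) / (sigma * math.sqrt(2.0))))


def crossing(xs, ys, level):
    """Linearly interpolate where a rising profile reaches the given level."""
    for i in range(len(xs) - 1):
        if ys[i] <= level <= ys[i + 1] and ys[i + 1] > ys[i]:
            fraction = (level - ys[i]) / (ys[i + 1] - ys[i])
            return xs[i] + fraction * (xs[i + 1] - xs[i])
    raise ValueError("level not crossed")


xs = [i * 0.05 for i in range(-400, 401)]        # positions in mm
ys = [edge_response(x, 0.0, 1.4) for x in xs]    # synthetic profile, sigma = 1.4 mm
low = edge_response(-1.0, 0.0, 1.0)              # value one sigma below the edge
high = 1.0 - low                                 # value one sigma above the edge
sigma_estimate = 0.5 * (crossing(xs, ys, high) - crossing(xs, ys, low))
```

On real data with statistical noise, a fit of the full cumulative Gaussian, as done in Fig.6 (b), is more robust than reading two levels.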


The high spatial resolution exhibited in the experimental results mainly arises from two key factors. 
First, this study employs a Micromegas micropattern detector with a pitch of only 1.5 mm between the 
readout strips. Assuming a uniform distribution of hits across a strip, the strip pitch 
sets the theoretical limit of the system spatial resolution at approximately 0.43 mm (standard 
deviation). Second, the FPGA-based hardware position reconstruction algorithm proposed in this paper is 
founded on the μTPC principle. As discussed in the introduction, the μTPC principle treats the Micromegas 
detector as a micro time projection chamber, enabling the reconstruction of the starting point of the particle's 
initial ionization drift path through the temporal order of multiple induced signals. Previous research has 
indicated that the μTPC principle offers superior position reconstruction accuracy compared to traditional 
centroid methods [19,20]. 
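
The 0.43 mm limit quoted in the preceding paragraph is the standard deviation of a uniform distribution whose full width equals one strip pitch, that is, pitch/√12. A minimal check, with a hypothetical function name:

```python
import math


def uniform_sigma(width_mm):
    """Standard deviation of a uniform distribution of the given full width."""
    return width_mm / math.sqrt(12.0)


pitch_limit_mm = uniform_sigma(1.5)   # 1.5 mm strip pitch from the text
print(f"{pitch_limit_mm:.2f} mm")
```

The measured resolution of about 1.4 mm is therefore roughly three times this digitization limit.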
Fig.6 The experiment results of position resolution. (a) The imaging result of Am-Be neutron radiation 
field. (b) The fitting result with the convolution function of step distribution and Gaussian distribution. 


3.3 The BNCT beam experiment 


The beam experiment was conducted on a dedicated BNCT reactor neutron source, the 
in-hospital neutron irradiator at the China Institute of Atomic Energy, and the theoretical flux of the neutron beam is 
up to 10⁹ n·cm⁻²·s⁻¹ [34]. Considering the high beam flux, an ultra-thin (30 nm) conversion film was employed 
in the experiment, and natural LiF rather than purified ⁶LiF was adopted as the conversion material to reduce 
the conversion efficiency. Fig.7 (a) displays a photograph of the experimental setup, and Fig.7 (b) shows the 
spatial distribution of thermal neutron components in the beam obtained with the conversion film. 

It can be observed that the neutron beam exhibits a circular profile with a diameter of 
approximately 12 cm, which conforms to the shape of the beam outlet. The evenly distributed light-colored 
grid points result from the mesh pillars of the Micromegas detector. The spacing between the mesh pillars is 
about 10 mm, and their diameter is about 1 mm, which is consistent with the measurement results. The 
large-area LiF film for the thermal neutron detector is composed of four 10 cm × 10 cm square films. Therefore, 
the slightly lower neutron counts at the junctions of the LiF films can be clearly seen, as well as the 20 cm × 
20 cm square boundary outside the beam profile. 

During the beam test, the position reconstruction results were obtained immediately after data acquisition was 
completed. Compared to the measurement times on the order of hours for traditional point-by-point scanning 
with active detectors or spatial arrangement of passive detectors, this study improves the equipment's 
practicality and measurement efficiency. 

Additionally, the measurement of the spatial distribution of neutron beam flux in BNCT is concerned with the 
relative distribution rather than the absolute value of the neutron flux. This parameter will serve as a relative coefficient 
input to the future BNCT treatment planning system, and thus absolute numerical values are not presented 
here. If accurate absolute fluence values at specific points can be measured in the future, they can be combined with 
the relative fluence distribution measured by the method demonstrated in this paper to obtain a complete 
two-dimensional absolute fluence distribution. Moreover, due to the challenge of producing a large, 
uniform, ultra-thin (30 nm) natural LiF film, preliminary tests were conducted using four 10 cm × 10 cm square 
films pieced together. Besides, uniformity calibration of the films was not performed in the preliminary tests, 
making it hard to obtain comprehensive and precise uniformity values of the beam flux at this stage. Since the 
nuclear reactor is normally shut down, the availability of beam time for experiments is quite limited. 
Therefore, these limitations will be addressed in future system-level work. 

In summary, although there are some imperfections in the test results, the preliminary beam experiment 
has already confirmed the online rapid measurement capability of the FPGA-based particle position 
reconstruction method proposed in this paper. Furthermore, the details of the beam experiment results, along 
with the results of the position resolution experiment, indicate the system's excellent spatial resolution. 
Fig.7 (a) A photograph of the neutron beam experiment. (b) The spatial distribution of thermal neutrons 
obtained with the LiF conversion film. 


4. Conclusion 


This paper proposes an FPGA-based online position reconstruction method rooted in the μTPC principle, 
effectively eliminating the need for lengthy offline data processing, thereby enhancing equipment practicality 
and improving measurement efficiency. This study contributes to advancing the application of Micromegas 
detectors in measuring the flux spatial distribution of BNCT neutron beams, consequently forming a rapid and 
precise measurement technique. Besides, in response to the difficulty that FPGAs face in sorting and searching 
operations, this study presents a rapid sorting and searching technique based on timestamp to RAM address 
mapping. In comparison to traditional approaches such as bubble sorting and traversal searching, the proposed 
method simplifies the design of logic firmware and reduces time complexity from O(n²) to O(n). In addition, 
given that timestamps are a common piece of information in physics experiments, the proposed 
method can provide a useful reference for researchers who need to execute timestamp-related 
algorithms within an FPGA. 